Abstract

Large and moderate deviation probabilities play an important role in many
applied areas, such as insurance and risk analysis. This paper studies the exact
moderate and large deviation asymptotics in non-logarithmic form for linear
processes with independent innovations. The linear processes we analyze are
general and therefore they include the long memory case. We give an asymptotic
representation for the probability of the tail of the normalized sums and specify
the zones in which it can be approximated either by a standard normal distribution
or by the marginal distribution of the innovation process. The results are
then applied to regression estimates, moving averages, fractionally integrated
processes, linear processes with regularly varying exponents and functions of
linear processes. We also consider the computation of value at risk and expected
shortfall, fundamental quantities in risk theory and finance.

1 Introduction and notations

Let $(\xi_i)_{i\in\mathbb{Z}}$ be a sequence of independent and identically distributed centered
random variables and $(c_{ni})$ a sequence of constants. This paper focuses on the
moderate and large deviations in non-logarithmic form for the linear process of
the form

$$S_n = \sum_{i=1}^{k_n} c_{ni}\xi_i. \qquad (1)$$

This class of linear processes is versatile enough to help analyze regression
estimates, moving averages that include long memory processes, linear processes
with regularly varying coefficients, fractionally integrated processes, and
functions of linear processes.

Our goal is to find an asymptotic representation for the tail probabilities
of the normalized sums defined by (1). Estimates of deviation probabilities
arise naturally in many applied areas, for instance in insurance problems
involving large claims.

Specifically, we aim to find a function $N_n(x)$ such that, as $n \to \infty$,

$$\frac{P(S_n/\sigma_n \ge x)}{N_n(x)} = 1 + o(1), \quad \text{where } \sigma_n^2 = \|S_n\|_2^2 = \sum_{i=1}^{k_n} c_{ni}^2. \qquad (2)$$

If $x \ge 0$ is fixed, then (2) becomes the well-known central limit theorem by letting
$N_n(x) = 1 - \Phi(x)$, where $\Phi(x)$ is the standard normal distribution function.
In this paper we call $P(S_n/\sigma_n \ge x)$ the moderate or large deviation probabilities
depending on the speed of $x \to \infty$. These tail probabilities of rare events can
be very small. Here we call (2) the exact approximation, which is more accurate
and holds under less restrictive moment conditions than the logarithmic version

$$\frac{\log P(S_n/\sigma_n \ge x)}{\log N_n(x)} = 1 + o(1). \qquad (3)$$

For example, suppose $P(S_n/\sigma_n \ge x) = 10^{-4}$ and $N_n(x) = 10^{-5}$; then their
logarithmic ratio is 0.8, which does not appear to be very different from 1, while
the ratio for the exact version (2) is as big as 10. A multiplicative factor of this
order can cause substantially different industrial standards in designing projects
that can survive natural disasters. The logarithmic version (3) is incapable of
effectively characterizing the differences between the tail probabilities.
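
The arithmetic in this example can be checked directly. The following minimal sketch (the function names are illustrative, not taken from the paper) computes both the logarithmic ratio in (3) and the exact ratio in (2) for the two probabilities quoted above.

```python
import math


def log_ratio(p_true: float, p_approx: float) -> float:
    """Ratio of logarithms, as in the logarithmic version (3)."""
    return math.log(p_true) / math.log(p_approx)


def exact_ratio(p_true: float, p_approx: float) -> float:
    """Ratio of the probabilities themselves, as in the exact version (2)."""
    return p_true / p_approx


# Values from the example in the text.
p_true, p_approx = 1e-4, 1e-5
print("logarithmic ratio:", log_ratio(p_true, p_approx))
print("exact ratio:      ", exact_ratio(p_true, p_approx))
```

The logarithmic ratio stays close to 1 even though the two probabilities differ by an order of magnitude, which is the point the text makes.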

As early as 1929, Khinchin considered the problem of moderate and large de- 
viation probabilities in non-logarithmic form for independent Bernoulli random 
variables. The first large deviation probability result appeared in S. Nagaev 
(1965). A. Nagaev (1969) studied large deviation probabilities of i.i.d. random 
variables with regularly varying tails. Mikosch and A. Nagaev (1998) applied 
the large deviation probabilities for heavy-tailed random variables to insurance 
mathematics. Reviews of this topic can be found in S. Nagaev (1979)
and Rozovski (1993). Rubin and Sethuraman (1965), Slastnikov (1978) and
Frolov (2005) considered the moderate or large deviations for arrays of independent
random variables. S. Nagaev (1979) presented the following very useful
result: in (1) assume $k_n = n$, $c_{ni} = 1$, and that $\xi_0$ has a regularly varying right
tail, i.e.

$$P(\xi_0 \ge x) = \frac{h(x)}{x^t}(1 + o(1)) \quad \text{as } x \to \infty \text{ for some } t > 2, \qquad (4)$$

where $h(x)$ is a slowly varying function (Bingham, Goldie and Teugels, 1987),
namely $\lim_{x\to\infty} h(\lambda x)/h(x) = 1$ for all $\lambda > 0$. Note that $\sigma_n = \sqrt{n}$. If in
addition, for some $p > 2$, $\xi_0$ has absolute moment of order $p$, then

$$P\Big(\sum_{i=1}^{n} \xi_i \ge x\sigma_n\Big) = (1 - \Phi(x))(1 + o(1)) + nP(\xi_0 \ge x\sigma_n)(1 + o(1)) \qquad (5)$$

for $n \to \infty$ and $x \ge 1$. Note that (5) implies (2) with

$$N_n(x) = (1 - \Phi(x)) + nP(\xi_0 \ge x\sigma_n). \qquad (6)$$

Hence if $1 - \Phi(x) = o[nP(\xi_0 \ge x\sigma_n)]$ (resp. $nP(\xi_0 \ge x\sigma_n) = o(1 - \Phi(x))$), then
in (2) we can also choose $N_n(x) = nP(\xi_0 \ge x\sigma_n)$ (resp. $N_n(x) = 1 - \Phi(x)$).
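
To see how the two terms of (6) trade places, here is a minimal sketch that evaluates them for an i.i.d. sum. It assumes, purely for illustration, a pure power tail $P(\xi_0 \ge y) = h_0 y^{-t}$ (that is, $h \equiv h_0$); the function names and the parameter values are hypothetical and are not taken from the paper.

```python
import math


def normal_tail(x: float) -> float:
    """1 - Phi(x) for the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def power_tail(y: float, t: float, h0: float = 1.0) -> float:
    """Assumed innovation tail P(xi_0 >= y) = h0 * y**(-t), capped at 1."""
    return min(1.0, h0 * y ** (-t))


def nagaev_terms(x: float, n: int, t: float, h0: float = 1.0):
    """Return the Gaussian term and the heavy-tail term of N_n(x) in (6)."""
    sigma_n = math.sqrt(n)  # k_n = n and c_ni = 1, so sigma_n = sqrt(n)
    gaussian = normal_tail(x)
    heavy = n * power_tail(x * sigma_n, t, h0)
    return gaussian, heavy


for x in (2.0, 4.0, 6.0):
    g, h = nagaev_terms(x, n=1000, t=3.0)
    print(f"x={x}: gaussian={g:.3e}, heavy={h:.3e}, N_n={g + h:.3e}")
```

For small $x$ the Gaussian term dominates, and for large $x$ the term $nP(\xi_0 \ge x\sigma_n)$ takes over, matching the two regimes described after (6).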

The study of moderate and large deviation probabilities in non-logarithmic
form for dependent random variables is still in its initial stage. Ghosh (1974)
considered moderate deviations for m-dependent random variables. Chen (2001)
obtained a moderate deviation result for Markov processes. Grama (1997) and
Grama and Haeusler (2006) investigated the martingale case. Wu and Zhao
(2008) studied moderate deviations for stationary processes, with results that
apply to many time series models. However, the result in the latter paper can
only be applied to linear processes with short memory and their transformations.

For analyzing linear processes with long memory and for obtaining other
interesting applications, we study processes of type (1). Under mild conditions on
the coefficients, we shall point out the zones in which the deviation probabilities
can be approximated either by a standard normal distribution or by using the
distribution of $\xi_0$. Our main result is that (2) holds in our case with

$$N_n(x) = (1 - \Phi(x)) + \sum_{i=1}^{k_n} P(c_{ni}\xi_0 \ge x\sigma_n).$$

The paper has the following structure. Section 2 presents a general moderate
and large deviation result and various applications. Section 3 illustrates the
results of a numeric study. In Section 4 we prove the results. In the Appendix
we give some auxiliary results and we also mention some known facts needed
for the proofs.

Before stating our results we introduce the notations used throughout this
paper: $a_n \sim b_n$ means that $\lim_{n\to\infty} a_n/b_n = 1$; $a_n = O(b_n)$ and also $a_n \ll b_n$
mean that $\limsup_{n\to\infty} a_n/b_n < \infty$; $a_n = o(b_n)$ if $\lim_{n\to\infty} a_n/b_n = 0$. By $\|X\|_p$
we denote $(E|X|^p)^{1/p}$. The notations $l(\cdot)$, $h(\cdot)$ and $\ell(\cdot)$ denote slowly varying
functions.






2 Main Results 



It is convenient to normalize by the variance and, throughout the paper, we
assume that:

Condition A. $(\xi_i)_{i\in\mathbb{Z}}$ are i.i.d. centered random variables with finite second
moment, $E(\xi_0^2) = 1$.



2.1 General linear processes 

Our first results apply to general linear processes of type (1) with i.i.d.
innovations. For $c_{ni} > 0$ and $t > 0$ we define

$$B_{nt} = \sum_{i=1}^{k_n} c_{ni}^t, \qquad (7)$$

$$\sigma_n^2 = \mathrm{var}(S_n) = B_{n2}, \qquad (8)$$

and

$$D_{nt} = B_{n2}^{-t/2} B_{nt}. \qquad (9)$$

The basic assumption in all our results is the uniform asymptotic negligibility
of the variance of individual summands, namely

$$\max_{1\le i\le k_n} c_{ni}^2/\sigma_n^2 \to 0. \qquad (10)$$

Our first theorem extends Nagaev's result in (5) to general linear processes.
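
The quantities in (7)-(10) are simple functions of the coefficient array. The sketch below (helper names are illustrative) computes $B_{nt}$, $D_{nt}$ and the ratio appearing in (10) for one row of coefficients; the equal-weight case $c_{ni} = 1$, $k_n = n$ serves as a sanity check, since then $D_{nt} = n^{1 - t/2}$ and the ratio in (10) equals $1/n$.

```python
def B(coeffs, t):
    """B_nt = sum_i c_ni**t, as in (7); coefficients are assumed positive."""
    return sum(c ** t for c in coeffs)


def D(coeffs, t):
    """D_nt = B_n2**(-t/2) * B_nt, as in (9)."""
    return B(coeffs, 2.0) ** (-t / 2.0) * B(coeffs, t)


def negligibility_ratio(coeffs):
    """max_i c_ni**2 / sigma_n**2, the quantity required to vanish in (10)."""
    return max(c * c for c in coeffs) / B(coeffs, 2.0)


coeffs = [1.0] * 100  # equal weights: the i.i.d. sum of Nagaev's setting
print("D_n3 =", D(coeffs, 3.0))                      # n**(1 - 3/2) = 0.1
print("ratio in (10) =", negligibility_ratio(coeffs))  # 1/n = 0.01
```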

Theorem 2.1 Assume that $(\xi_i)_{i\in\mathbb{Z}}$ satisfies Condition A, and for a certain
$t > 2$ it satisfies the right tail condition (4). Moreover, for a certain $p > 2$,
$\|\xi_0\|_p < \infty$. Assume also that $c_{ni} > 0$ and (10) is satisfied. Then, as $n \to \infty$,

$$P(S_n \ge x\sigma_n) = (1 + o(1))\sum_{i=1}^{k_n} P(c_{ni}\xi_0 \ge x\sigma_n) + (1 - \Phi(x))(1 + o(1)) \qquad (11)$$

holds for all $x > 0$.



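Corollary 2.1 Under the conditions of Theorem 2.1, for $x \ge a(\ln D_{nt}^{-1})^{1/2}$ with
$a > 2^{1/2}$ we have

$$P(S_n \ge x\sigma_n) = (1 + o(1))\sum_{i=1}^{k_n} P(c_{ni}\xi_0 \ge x\sigma_n) \quad \text{as } n \to \infty. \qquad (12)$$

On the other hand, if $0 < x \le b(\ln D_{nt}^{-1})^{1/2}$ with $b < 2^{1/2}$, we have

$$P(S_n \ge x\sigma_n) = (1 - \Phi(x))(1 + o(1)) \quad \text{as } n \to \infty. \qquad (13)$$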
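
The two zones of Corollary 2.1 depend on the coefficients only through $D_{nt}$. A minimal sketch of the resulting classification follows; the function name `zone` and the default constants `a = 1.5` and `b = 1.3` are illustrative choices satisfying $a > 2^{1/2} > b$, not values prescribed by the paper.

```python
import math


def zone(x: float, d_nt: float, a: float = 1.5, b: float = 1.3) -> str:
    """Classify x > 0 according to the thresholds of Corollary 2.1.

    Returns "tail" where (12) applies, "gaussian" where (13) applies, and
    "intermediate" where only the two-term formula (11) is available.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    if not 0.0 < d_nt < 1.0:
        raise ValueError("expected 0 < D_nt < 1")
    if not (a > math.sqrt(2.0) > b > 0.0):
        raise ValueError("need a > sqrt(2) > b > 0")
    scale = math.sqrt(math.log(1.0 / d_nt))
    if x >= a * scale:
        return "tail"
    if x <= b * scale:
        return "gaussian"
    return "intermediate"


# Equal weights with n = 10**6 and t = 3 give D_nt = n**(1 - t/2) = 1e-3.
for x in (2.0, 3.6, 5.0):
    print(x, zone(x, d_nt=1e-3))
```

Since $D_{nt} \to 0$ under (10), both thresholds grow with $n$, so a fixed $x$ eventually falls in the Gaussian zone.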






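Remark 2.1 Notice that (12) suggests that if $x \ge a(\ln D_{nt}^{-1})^{1/2}$ where
$a > 2^{1/2}$, then (2) holds with

$$N_n(x) = x^{-t}\sum_{i=1}^{k_n}\Big(\frac{c_{ni}}{\sigma_n}\Big)^t h\Big(\frac{x\sigma_n}{c_{ni}}\Big).$$

For the special case in which $\lim_{x\to\infty} h(x) = h_0 > 0$, we can also choose

$$N_n(x) = \frac{h_0}{x^t}D_{nt}. \qquad (14)$$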



As a consequence of Theorem 2.1 we have

Corollary 2.2 Assume that $(\xi_i)_{i\in\mathbb{Z}}$ satisfies Condition A, and for a certain
$t > 2$ it satisfies

$$\cdots = \frac{\cdots}{x^{t}}(1 + o(1)) \quad \text{as } x \to \infty. \qquad (15)$$

Assume also that $c_{ni} > 0$ and (10) is satisfied. Then the conclusions of Theorem
2.1 and Corollary 2.1 are valid.



Notice that (12) and (13) assert different approximations for the tail
probability $P(S_n \ge x\sigma_n)$. For $x$ smaller than a threshold we have moderate
deviation behavior, and we can approximate this probability by using a normal
distribution. On the other hand, for $x$ larger than another threshold we have a
large deviation type of behavior, where $S_n$ exceeds a level essentially because
one of the summands is large.

The proofs of these results are based on a separate study of the behaviors
of type (12) or (13), which is of independent interest. As a matter of fact, we
shall see in the next two theorems that a result similar to (12) holds without
the assumption of a finite moment of order $p$, while the moderate deviation
(13) does not require a regularly varying right tail.

Theorem 2.2 Assume that $(\xi_i)_{i\in\mathbb{Z}}$ satisfies Condition A, and for a certain
$t > 2$ it satisfies (4). Let $c_{ni} > 0$ be a sequence of constants satisfying (10).
Then, for $x \ge C_t(\ln D_{nt}^{-1})^{1/2}$ with $C_t > e^{t/2}(t + 2)/2^{1/2}$, the large deviation
result (12) holds.



As a counterpart to this result we shall now formulate the moderate deviation
bound. Recall that $\Phi(x)$ is the standard normal distribution function.
Notice that $D_{nt}$ is, up to a factor involving the moments of the innovations, the
Lyapunov proportion.

Theorem 2.3 Assume that $(\xi_i)_{i\in\mathbb{Z}}$ satisfies Condition A and for a certain
$p > 2$, $\|\xi_0\|_p < \infty$. Assume that (10) is satisfied. If $x^2 \le 2\ln(D_{np}^{-1})$ then the
moderate deviation result (13) holds.

2.2 Applications to linear regression estimates



Many random evolutions and also statistical procedures, such as estimation
of regression coefficients, produce linear statistics of type (1). See for instance
Chapter 9 in Beran (1994) for the case of parametric regression, or the paper by
Robinson (1997), where kernel estimators are used for nonparametric regression.
Here we consider the simple parametric regression model $Y_i = \beta\alpha_i + \xi_i$, where
$(\xi_i)$ is an i.i.d. sequence of errors, $(\alpha_i)$ is a sequence of positive real numbers
and $\beta$ is the parameter of interest. The least squares estimator $\hat\beta_n$ of $\beta$, based
on a sample of size $n$, satisfies

$$S_n := \hat\beta_n - \beta = \frac{1}{\sum_{i=1}^{n}\alpha_i^2}\sum_{i=1}^{n}\alpha_i\xi_i, \qquad (16)$$

so the representation of type (1) holds with $c_{ni} = \alpha_i/(\sum_{i=1}^{n}\alpha_i^2)$. Denote
$A_{nt} = \sum_{i=1}^{n}\alpha_i^t$. Notice that $\mathrm{var}(S_n) = 1/A_{n2}$.
Assume

$$\lim_{n\to\infty} A_{n2}^{-1}\max_{1\le i\le n}\alpha_i^2 = 0. \qquad (17)$$

As an immediate consequence of Theorem 2.1 we obtain:



Corollary 2.3 (i) Assume that $(\xi_i)_{i\in\mathbb{Z}}$ satisfies the conditions in Theorem 2.1.
Under assumption (17), for $x > 0$, we have

$$P(\hat\beta_n - \beta \ge x/A_{n2}^{1/2}) = (1 + o(1))\sum_{i=1}^{n} P(\xi_0 \ge xA_{n2}^{1/2}/\alpha_i) + (1 + o(1))(1 - \Phi(x)).$$

(ii) If $x > 0$ and $x^2 \le 2\ln(A_{n2}^{t/2}/A_{nt})$, we have

$$P(\hat\beta_n - \beta \ge x/A_{n2}^{1/2}) = (1 + o(1))(1 - \Phi(x)).$$

(iii) If $x > 0$ and $x^2 \ge c_t\ln(A_{n2}^{t/2}/A_{nt})$ with $c_t > 2$ then

$$P(\hat\beta_n - \beta \ge x/A_{n2}^{1/2}) = (1 + o(1))\sum_{i=1}^{n} P(\xi_0 \ge xA_{n2}^{1/2}/\alpha_i).$$

Results similar to Theorem 2.2 and Theorem 2.3 can also be easily formulated.
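
The reduction of the least squares estimator to the linear form (1) can be verified numerically. The sketch below uses a small hypothetical design (`alphas`, `beta` and the Gaussian errors are illustrative choices, not data from the paper); it checks that $\hat\beta_n - \beta$ equals $\sum_i c_{ni}\xi_i$ with $c_{ni} = \alpha_i/A_{n2}$, that $\sum_i c_{ni}^2 = 1/A_{n2}$, and that $D_{nt}$ from (9) reduces to $A_{nt}/A_{n2}^{t/2}$.

```python
import random


def A(alphas, t):
    """A_nt = sum_i alpha_i**t."""
    return sum(a ** t for a in alphas)


def regression_weights(alphas):
    """c_ni = alpha_i / A_n2, the weights of the errors in beta_hat - beta."""
    a_n2 = A(alphas, 2)
    return [a / a_n2 for a in alphas]


# Hypothetical design and parameter, for illustration only.
alphas = [1.0, 2.0, 3.0, 4.0]
beta = 0.7
random.seed(1)
errors = [random.gauss(0.0, 1.0) for _ in alphas]
ys = [beta * a + e for a, e in zip(alphas, errors)]

# Least squares estimator for the model without intercept.
beta_hat = sum(a * y for a, y in zip(alphas, ys)) / A(alphas, 2)

weights = regression_weights(alphas)
linear_form = sum(c * e for c, e in zip(weights, errors))
sum_sq_weights = sum(c * c for c in weights)  # equals 1 / A_n2
d_n3 = A(alphas, 3) / A(alphas, 2) ** 1.5     # D_nt for t = 3

print(beta_hat - beta, linear_form)
print(sum_sq_weights, 1.0 / A(alphas, 2))
print("D_n3 =", d_n3)
```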



Theorems 2.1, 2.2 and 2.3 are also applicable to the nonlinear regression
model $y_i = g(x_i) + \xi_i$, $1 \le i \le n$, where $g(x)$ is an unknown function and $\xi_i$ is
the noise. Let $x_i$ be the deterministic design points. Then the Nadaraya-Watson
estimate $\hat g_n$ satisfies

$$\hat g_n(x) - E\hat g_n(x) = \sum_{i=1}^{n} c_{ni}(x)\xi_i$$

with

$$c_{ni}(x) = \frac{K((x - x_i)/b_n)}{\sum_{j=1}^{n} K((x - x_j)/b_n)},$$

where $K$ is a kernel function and $b_n > 0$ is a sequence of bandwidths converging
to 0, and therefore is of the type (1).



2.3 Application to moving averages 



We now consider the sum $S_n = \sum_{k=1}^{n} X_k$, where

$$X_k = \sum_{j=-\infty}^{\infty} a_{k-j}\xi_j. \qquad (18)$$

We assume that $\sum_{i\in\mathbb{Z}} a_i^2 < \infty$, which is the necessary and sufficient condition
for the existence of $X_1$. Observe that $S_n = \sum_{j=-\infty}^{\infty} b_{nj}\xi_j$ is of form (1) with

$$b_{nj} = a_{1-j} + \cdots + a_{n-j}. \qquad (19)$$

Define $D_{nt}$ by (9) with $c_{ni} = b_{ni}$. We know from Peligrad and Utev (1997) that
under the assumption $\sigma_n \to \infty$ we have

$$\sigma_n^{-1}\sup_j |b_{nj}| \to 0 \quad \text{as } n \to \infty.$$

Therefore condition (10) is automatically satisfied and as a corollary of
Theorems 2.1, 2.2 and 2.3 we obtain

Corollary 2.4 Assume that $(X_n)_{n\ge1}$ is defined by (18) and $\sigma_n \to \infty$.

(i) Assume that $(\xi_i)_{i\in\mathbb{Z}}$ satisfies the conditions of Theorem 2.1 and $b_{nk} \ge 0$;
then (11) holds.

(ii) Let $(\xi_i)_{i\in\mathbb{Z}}$ be as in Theorem 2.2. Assume $b_{nk} \ge 0$; then the large deviation
result (12) holds.

(iii) Assume $(\xi_i)_{i\in\mathbb{Z}}$ is as in Theorem 2.3; then the moderate deviation result
(13) is valid.



Notice that this corollary applies to general linear processes including the
long memory processes with $\sum_i |a_i| = \infty$. Asymptotic properties for long memory
processes can be quite different from those of processes with short memory,
partially because the variance of the partial sum goes to infinity at an order
different from $n$; see for example Ho and Hsing (1997), Robinson (2003), Doukhan,
Oppenheim and Taqqu (2003) among others. Hall (1992) gave a Berry-Esseen
bound for the convergence rate in the central limit theorem.
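
The rewriting of the partial sum of a moving average as a weighted sum of innovations, with weights $b_{nj} = a_{1-j} + \cdots + a_{n-j}$, can be checked on a finite-order causal example. In the sketch below the coefficient list, the sample size and the Gaussian innovations are hypothetical choices made only so that the identity can be verified exactly.

```python
import random


def a(i, coeffs):
    """Causal moving-average coefficient a_i (zero outside 0..q)."""
    return coeffs[i] if 0 <= i < len(coeffs) else 0.0


def b(n, j, coeffs):
    """b_nj = a_{1-j} + ... + a_{n-j}, the weight of xi_j in S_n."""
    return sum(a(k - j, coeffs) for k in range(1, n + 1))


coeffs = [1.0, 0.5, 0.25]  # hypothetical MA(2) coefficients a_0, a_1, a_2
q = len(coeffs) - 1
n = 5

random.seed(0)
# Innovations xi_j for every index that can enter X_1, ..., X_n.
xi = {j: random.gauss(0.0, 1.0) for j in range(1 - q, n + 1)}

# Direct computation: X_k = sum_i a_i xi_{k-i}, then S_n = X_1 + ... + X_n.
X = {k: sum(a(i, coeffs) * xi[k - i] for i in range(q + 1)) for k in range(1, n + 1)}
direct = sum(X.values())

# Same sum written as a linear form in the innovations.
via_weights = sum(b(n, j, coeffs) * xi[j] for j in xi)

print(direct, via_weights)
```

For a long memory sequence the same function `b` applies after truncating the infinite past, which is an approximation rather than an identity.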

We shall now apply this corollary to the important particular case of causal
long-memory processes with

$$a_i = l(i+1)(1+i)^{-r},\ i \ge 0, \text{ with } 1/2 < r < 1, \text{ and } a_i = 0 \text{ otherwise}, \qquad (20)$$

where the results can be given in a more precise form. Here $l(\cdot)$ is a slowly
varying function. Notice that in this particular case



j = -oo 

Let $a_0 = 1$. This case of long memory linear processes covers the well-known
fractional ARIMA processes (cf. Granger and Joyeux, 1980; Hosking, 1981),
which play an important role in financial time series modeling and application.
As a special case, let $0 < d < 1/2$ and $B$ be the backward shift operator with
$B\varepsilon_k = \varepsilon_{k-1}$ and consider

$$X_k = (1 - B)^{-d}\xi_k = \sum_{i\ge0} a_i\xi_{k-i}, \quad \text{where } a_i = \frac{\Gamma(i+d)}{\Gamma(d)\Gamma(i+1)}.$$

For this example we have $\lim_{n\to\infty} a_n/n^{d-1} = 1/\Gamma(d)$. Notice that these
processes have long memory because $\sum_{j\ge0} |a_j| = \infty$.
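
The fractional coefficients and their power-law decay are easy to examine numerically. The sketch below assumes the standard expansion $a_i = \Gamma(i+d)/(\Gamma(d)\Gamma(i+1))$ of $(1-B)^{-d}$ and evaluates it through log-gamma functions to avoid overflow; the value `d = 0.3` is an arbitrary illustrative choice.

```python
import math


def farima_coeff(i: int, d: float) -> float:
    """a_i = Gamma(i + d) / (Gamma(d) * Gamma(i + 1)), computed via lgamma."""
    return math.exp(math.lgamma(i + d) - math.lgamma(d) - math.lgamma(i + 1))


d = 0.3

# The first coefficients satisfy a_0 = 1 and a_i = a_{i-1} * (i - 1 + d) / i.
print([round(farima_coeff(i, d), 6) for i in range(5)])

# Decay rate: a_n / n**(d - 1) should approach 1 / Gamma(d).
n = 10 ** 6
print(farima_coeff(n, d) / n ** (d - 1.0), 1.0 / math.gamma(d))

# Partial sums of a_j keep growing (roughly like n**d): long memory.
for m in (10 ** 2, 10 ** 3, 10 ** 4):
    print(m, sum(farima_coeff(j, d) for j in range(m)))
```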



Corollary 2.5 Assume (20). If $(\xi_i)_{i\in\mathbb{Z}}$ satisfies the conditions of Theorem 2.1
then (11) holds. In particular (12) holds for $x \ge c_1(\ln n)^{1/2}$ with $c_1 > (t - 2)^{1/2}$,
while (13) holds, provided $0 < x \le c_2(\ln n)^{1/2}$ with $c_2 < (t - 2)^{1/2}$.



For this case Theorems 2.2 and 2.3 give:

Corollary 2.6 (i) Let $(\xi_i)_{i\in\mathbb{Z}}$ be as in Theorem 2.2. Then (12) holds for
$x \ge c_1(\ln n)^{1/2}$ with $c_1 > (t - 2)^{1/2}e^{t/2}(t + 2)/2$.



(ii) Assume $(\xi_i)_{i\in\mathbb{Z}}$ is as in Theorem 2.3. Then (13) holds, provided
$x \le [(p - 2)\ln n]^{1/2}$.



2.4 Application to risk measures 



In risk theory and finance, value at risk (VaR) and expected shortfall (ES) 
play a fundamental role; see Jorion (2006), Holton (2003), McNeil et al (2005), 
Acerbi and Tasche (2002) among others. Mathematically, they are equivalent to 
quantiles and tail conditional expectations. In practice one is most interested 
in their extremal behavior which corresponds to tail quantiles. Despite their 
importance, however, their computation can be quite difficult and the related 
asymptotic justification is far from being trivial. 

Here we shall apply Theorem 2.1 and provide approximate formulae for
extremal quantiles and tail conditional expectations for $S_n$. Under the assumption
$\lim_{x\to\infty} h(x) = h_0 > 0$, by (14) and Theorem 2.1,

$$P(S_n \ge x\sigma_n) = (1 + o(1))\frac{h_0}{x^t}D_{nt} + (1 - \Phi(x))(1 + o(1)).$$

Given the tail probability $\alpha \in (0,1)$, let $q_{\alpha,n}$ be the upper $\alpha$-th quantile of
$S_n$, namely $P(S_n \ge q_{\alpha,n}) = \alpha$. Elementary calculations show that $q_{\alpha,n}$ can be
approximated by $x_\alpha\sigma_n$ in the sense that $\lim_{n\to\infty} x_\alpha\sigma_n/q_{\alpha,n} = 1$, where $x = x_\alpha$
is the solution to the equation

$$\frac{h_0}{x^t}D_{nt} + (1 - \Phi(x)) = \alpha.$$

In particular, if $\alpha \le h_0D_{nt}(a^2\ln D_{nt}^{-1})^{-t/2}$ with $a > 2^{1/2}$, then, by Corollary
2.1, we can approximate $q_{\alpha,n}$ by $(h_0D_{nt}/\alpha)^{1/t}\sigma_n = (B_{nt}h_0/\alpha)^{1/t}$. The
approximation is understood in the sense that $(B_{nt}h_0/\alpha)^{1/t}/q_{\alpha,n} \to 1$ as $n \to \infty$, and
the tail conditional expectation or expected shortfall is computed as



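$$E(S_n \mid S_n \ge q_{\alpha,n}) = \frac{q_{\alpha,n}P(S_n \ge q_{\alpha,n}) + \int_{q_{\alpha,n}}^{\infty} P(S_n \ge w)\,dw}{P(S_n \ge q_{\alpha,n})} \sim \frac{t}{t-1}B_{nt}^{1/t}(h_0/\alpha)^{1/t}.$$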

We emphasize that, without the exact moderate deviation principle of Theorem
2.1, the validity of the above equivalence cannot be guaranteed. To the best of
our knowledge, our example is one of the very few cases in which one can obtain an
explicit asymptotic expression for VaR and ES for sums of dependent random
variables.
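
The quantile recipe of this subsection can be turned into a short numerical routine. The sketch below is an illustration under stated assumptions, not the authors' procedure: it solves $h_0x^{-t}D_{nt} + (1 - \Phi(x)) = \alpha$ by bisection on a logarithmic scale, compares the result with the pure tail formula $(B_{nt}h_0/\alpha)^{1/t}$, and also reports an expected shortfall value obtained by integrating the power tail $h_0B_{nt}w^{-t}$ beyond the quantile, which gives the factor $t/(t-1)$. The equal-weight coefficients and the values of `n`, `t`, `h0` and `alpha` are hypothetical.

```python
import math


def normal_tail(x: float) -> float:
    """1 - Phi(x) for the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def solve_x_alpha(alpha, d_nt, t, h0, lo=1e-8, hi=1e8, iters=200):
    """Solve h0 * D_nt * x**(-t) + (1 - Phi(x)) = alpha for x by bisection.

    The left-hand side is decreasing in x, so the root is bracketed by lo, hi.
    """
    def excess(x):
        return h0 * d_nt * x ** (-t) + normal_tail(x) - alpha

    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint: the bracket spans decades
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def var_tail_approx(alpha, b_nt, t, h0):
    """Tail-zone quantile approximation (B_nt * h0 / alpha)**(1/t)."""
    return (b_nt * h0 / alpha) ** (1.0 / t)


def es_tail_approx(alpha, b_nt, t, h0):
    """Expected shortfall from integrating the power tail: t/(t-1) times VaR."""
    return t / (t - 1.0) * var_tail_approx(alpha, b_nt, t, h0)


# Hypothetical example: c_ni = 1, k_n = n.
n, t, h0, alpha = 10_000, 3.0, 1.0, 1e-6
b_nt = float(n)               # sum of c_ni**t
sigma_n = math.sqrt(n)        # sqrt(B_n2)
d_nt = b_nt / sigma_n ** t    # D_nt = B_n2**(-t/2) * B_nt

x_alpha = solve_x_alpha(alpha, d_nt, t, h0)
print("x_alpha * sigma_n =", x_alpha * sigma_n)
print("tail formula VaR  =", var_tail_approx(alpha, b_nt, t, h0))
print("tail formula ES   =", es_tail_approx(alpha, b_nt, t, h0))
```

When $\alpha$ is not small enough for the tail zone, only the value returned by `solve_x_alpha` should be used, since the Gaussian term is then no longer negligible.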



2.5 Functionals of linear processes 



In this subsection we shall use the result from point (ii) of Corollary 2.6 to
study the moderate deviation for nonlinear transformations of linear processes.
Let $K$ be a transformation which is measurable and $EK(X_0) = 0$. Let

$$H_n = \sum_{i=1}^{n} K(X_i).$$

For example, if $K(X_0) = I(X_0 \le \tau) - P(X_0 \le \tau)$, then $H_n/n$ becomes the
empirical process. If $X_i$ is a short memory linear process, namely the coefficients
$(a_i)$ are absolutely summable and their sum is different from 0, then we can apply
the moderate deviation principle in Wu and Zhao (2008). However, the result in the
latter paper is not applicable to long-range dependent processes. Despite its
importance in risk analysis, the problem of moderate deviation under strong
dependence has rarely been studied in the literature.

Here we shall establish such a principle in the context of nonlinear transforms
of linear processes. First, we introduce some necessary notation for this section.
Let $\mathcal{F}_n = (\ldots, \xi_{n-1}, \xi_n)$ be the shift process and define the projection operator
$\mathcal{P}_k = E(\cdot|\mathcal{F}_k) - E(\cdot|\mathcal{F}_{k-1})$. Denote the truncated processes $X_{n,k} = E(X_n|\mathcal{F}_k)$.
Now define the functions $K_n(w) = E[K(w + X_n - X_{n,0})]$ and $K_\infty(w) = E[K(w + X_n)]$.
We consider transformations $K$ with $\kappa := K_\infty'(0) \ne 0$. Define

$$S_{n,1} = \sum_{i=1}^{n}[K(X_i) - \kappa X_i] = H_n - \kappa S_n, \quad \text{where } S_n = \sum_{i=1}^{n} X_i.$$

Then $H_n = \kappa S_n + S_{n,1}$. For a function $g$, let $g(w; \lambda) = \sup_{|y|\le\lambda} |g(w + y)|$ be the
local maximal function. Denote the collection of functions with second order
partial derivatives by $C^2(\mathbb{R})$. We need the following regularity condition.

Condition B. Let $2 < q < p < 2q$ and assume $\|\xi_0\|_p < \infty$. Assume
$K_n \in C^2(\mathbb{R})$ for all large $n$ and that for some $\lambda > 0$,



2 

El 

1=0 



/4^l(X„,o; A)||, + |||6r/'A;,_i(X„a)ll, + = 0(1). 



A version of Condition B with $q = 2$ is used in Wu (2006). We shall establish
the following moderate deviation result. For $1/2 < r < 1$ and $1/2 < v < 1$
define

$$\chi(v, r) = v\max(r - r/v,\ 1/2 - r,\ r - 1),$$
$$\omega(r) = \mathrm{argmin}_{1/2<v<1}\,\chi(v, r) \quad \text{and} \quad \rho(r) = -\chi(\omega(r), r).$$

Theorem 2.4 Assume that Condition B holds with $q = p\,\omega(r)$ and the
conditions of Corollary 2.6 (ii) are satisfied. Let $c$ be such that $0 < c < p - 2$ and
$c < 2p\rho(r)$. Then if $x^2 \le c\ln n$, we have

$$P(H_n \ge |\kappa|\sigma_n x) = (1 - \Phi(x))(1 + o(1)) \quad \text{as } n \to \infty. \qquad (21)$$

Remark 2.2 As mentioned in the proof of Theorem 2.4 in Section 4.7, (21) is
still valid if the normalizing constant $|\kappa|\sigma_n$ therein is replaced by $\sqrt{\mathrm{var}(H_n)}$.

Remark 2.3 Theorem 2.4 only asserts a moderate deviation within the Gaussian
range. It is unclear whether the approximation of type (12) holds. We pose it
as an open problem.

Remark 2.4 An explicit form for $\omega(r)$ can be obtained. If $r \ge 3/4$, then $\omega(r) = r$.
If $r < 3/4$, then $\omega(r) = r/(2r - 1/2)$. If $2p\rho(r) > p - 2$, then the moderate
deviation in (21) has the same range as for $S_n$. The latter happens, for example,
if $r = 3/4$ and $2 < p < 16/5$, since in this case $2p\rho(3/4) > p - 2$.
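
The closed form in Remark 2.4 can be cross-checked numerically. The sketch below assumes the reading $\chi(v, r) = v\max(r - r/v,\ 1/2 - r,\ r - 1)$, which is the form consistent with the values of $\omega(r)$ and with the bound $p < 16/5$ quoted in the remark; the grid search and the helper names are illustrative.

```python
def chi(v: float, r: float) -> float:
    """Assumed form chi(v, r) = v * max(r - r/v, 1/2 - r, r - 1)."""
    return v * max(r - r / v, 0.5 - r, r - 1.0)


def omega_closed_form(r: float) -> float:
    """omega(r) as stated in Remark 2.4."""
    return r if r >= 0.75 else r / (2.0 * r - 0.5)


def omega_grid(r: float, steps: int = 20000) -> float:
    """Brute-force argmin of chi(., r) over a grid in the open interval (1/2, 1)."""
    best_v, best_val = None, float("inf")
    for k in range(1, steps):
        v = 0.5 + 0.5 * k / steps
        val = chi(v, r)
        if val < best_val:
            best_v, best_val = v, val
    return best_v


def rho(r: float) -> float:
    """rho(r) = -chi(omega(r), r)."""
    return -chi(omega_closed_form(r), r)


for r in (0.6, 0.75, 0.9):
    print(r, omega_grid(r), omega_closed_form(r), rho(r))

# Range condition of Remark 2.4 at r = 3/4: 2 * p * rho(3/4) > p - 2 iff p < 16/5.
for p in (3.0, 3.5):
    print(p, 2.0 * p * rho(0.75) > p - 2.0)
```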

Example 2.1 As an application to empirical processes, let $K(X) = I(X \le t) - P(X \le t)$,
where $t$ is fixed. Let $X_n = \xi_n + \sum_{i=1}^{\infty} a_i\xi_{n-i} =: \xi_n + Y_{n-1}$,
where $\|\xi_0\|_p < \infty$, $p > 2$, and its density function $f_\xi$ satisfies

$$\sup_u[f_\xi(u) + |f_\xi'(u)|] < \infty. \qquad (22)$$

Then $K_1(w) = F_\xi(t - w) - F_X(t)$, where $F_\xi$ is the distribution function of $\xi_1$.
Under (22), we clearly have $\sup_w[|K_1'(w)| + |K_1''(w)|] < \infty$. Observe that we
have the identity: for $n \ge 1$,

Hence $\sup_n\sup_w[|K_n'(w)| + |K_n''(w)|] < \infty$. So Condition B holds for any $\lambda$
since $\xi_i \in L^p$, $p > 2$.

3 A Numerical Study



In this section we shall design a numeric study of the accuracy of the large
deviation (12), normal approximation (13) and also the estimate (11) on a finite
sample. In particular, we shall study the accuracy of the approximations in
Corollary 2.5. In general it is very difficult to calculate tail probabilities by
simulation, especially if they are small. One may need to carry out an astronomically
large amount of computation to obtain reasonably good approximations.

Here we shall approach the problem from a different angle. We let Xj = 
Si^i o-i^j-ij where ^i, j S Z, have Student's t-distribution with degree of free- 
dom v = 3, and = i""-^. Let Sn = J27=i'^i with n — 300. Note that the 
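characteristic function of $\xi_i$ is

$$\varphi(t) = \frac{(\sqrt{\nu}|t|)^{\nu/2}K_{\nu/2}(\sqrt{\nu}|t|)}{\Gamma(\nu/2)2^{\nu/2-1}}, \qquad (23)$$

where $K_{\nu/2}$ is the Bessel function (see Hurst (1995)). Then the characteristic
function of $S_n$ is

$$\varphi_{S_n}(t) = \prod_{i}\varphi(b_{ni}t)$$

and by the inversion formula,

$$P(S_n \le x) - P(S_n \le x') = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-itx'} - e^{-itx}}{it}\varphi_{S_n}(t)\,dt.$$

In the above equation let $x' = 0$. Since $\xi_i$ is symmetric, $P(S_n \le 0) = 1/2$. In
our numeric study we shall use (23) to compute the probability $P(S_n \ge x)$.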

In Figure 1 we report the ratios $R(x) := \sum_i P(b_{ni}\xi_0 \ge x)/P(S_n \ge x)$ and
$g(x) := (1 - \Phi(x/\sigma_n))/P(S_n \ge x)$; see (12) with $c_{ni} = b_{ni}$. We can interpret
$R(x)$ (resp. $g(x)$) as the tail (resp. Gaussian) approximation. As expected from
Corollary 2.5, the Gaussian approximation is better if $x$ is small, while the tail
approximation $R(x)$ is better when $x$ is big. In the intermediate
region we approximate by their sum.


4 Proofs 

4.1 Preliminary approximations 

Let $(X_{ni})_{1\le i\le k_n}$ be a triangular array of independent random variables. We
shall approximate here the tail distribution of partial sums by the tail of the
sums of truncated random variables and a term involving the tail probabilities
of individual summands. We use the following notations:

$$S_n = \sum_{i=1}^{k_n} X_{ni}, \qquad S_n(j) = \sum_{i=1,\, i\ne j}^{k_n} X_{ni},$$

and for $x > 0$ and $\varepsilon > 0$ we set

$$X_{ni}^{(\varepsilon x)} = X_{ni}I(X_{ni} < \varepsilon x), \quad S_n^{(\varepsilon x)} = \sum_{i=1}^{k_n} X_{ni}^{(\varepsilon x)} \quad \text{and} \quad S_n^{(\varepsilon x)}(j) = \sum_{i=1,\, i\ne j}^{k_n} X_{ni}^{(\varepsilon x)}. \qquad (24)$$

Fig. 1. Tail approximation $R(x)$ (dashed curve), Gaussian approximation
$g(x)$ (solid curve) and their sum (dotted curve) for long-memory processes with
Student $t(3)$ innovations.

We shall prove the following key lemma that will be further exploited to
approximate the tail probability $P(S_n \ge x)$ in terms of the sum of the truncated
random variables and the tail distributions of the individual summands.

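Lemma 4.1 For any $0 < \eta < 1$ and $\varepsilon > 0$ such that $1 - \eta > \varepsilon$ we have

$$\Big|P(S_n \ge x) - P(S_n^{(\varepsilon x)} \ge x) - \sum_{j=1}^{k_n} P(X_{nj} \ge (1 - \eta)x)\Big| \le 4\Big(\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)\Big)^2 + 3\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)P(|S_n(j)| > \eta x) + \sum_{j=1}^{k_n} P((1 - \eta)x < X_{nj} < (1 + \eta)x).$$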

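Proof. We start by estimating $P(S_n \ge x)$ using the decomposition according
to whether $\max_i X_{ni} < \varepsilon x$ or $\max_i X_{ni} \ge \varepsilon x$; the latter can happen if exactly
one of the variables is at least $\varepsilon x$ or if at least two variables are. Formally,

$$P(S_n \ge x) = \sum_{j=1}^{k_n} P(S_n \ge x,\ X_{nj} \ge \varepsilon x,\ \max_{i\ne j} X_{ni} < \varepsilon x) + P(S_n \ge x,\ X_{ni} \ge \varepsilon x \text{ and } X_{nj} \ge \varepsilon x \text{ for some } i \ne j) + P(S_n \ge x,\ \max_i X_{ni} < \varepsilon x)$$
$$= A + B + C = \sum_{j=1}^{k_n} A_j + B + C.$$

The term $B$ can be easily majorized by

$$B \le \Big(\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)\Big)^2.$$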

We analyze now the first term. We introduce a new parameter $\eta > 0$. Since for
any two events $A$ and $B$ we have $|P(A) - P(B)| \le P(AB') + P(A'B)$ (here the
prime stands for the complement), for each $j$ we have

$$|A_j - P(X_{nj} \ge (1 - \eta)x)| \le P(S_n \ge x,\ X_{nj} \ge \varepsilon x,\ X_{nj} < (1 - \eta)x) + P(X_{nj} \ge (1 - \eta)x,\ S_n < x)$$
$$+ P(X_{nj} \ge (1 - \eta)x,\ X_{nj} < \varepsilon x) + P(X_{nj} \ge (1 - \eta)x,\ \max_{i\ne j} X_{ni} \ge \varepsilon x) = I + II + III + IV.$$

We treat each term separately. By independence and since $S_n \ge x$ and
$X_{nj} < (1 - \eta)x$ imply $S_n(j) \ge \eta x$, we derive

$$I \le P(X_{nj} \ge \varepsilon x)P(S_n(j) \ge \eta x).$$

The second term is treated in the following way:

$$II \le P((1 - \eta)x \le X_{nj} < (1 + \eta)x) + P(X_{nj} \ge (1 + \eta)x,\ S_n < x)$$
$$\le P((1 - \eta)x \le X_{nj} < (1 + \eta)x) + P(X_{nj} \ge (1 + \eta)x)P(-S_n(j) > \eta x).$$

Since $1 - \eta > \varepsilon$ the third term is $III = 0$. By independence, the fourth term is

$$IV = P(X_{nj} \ge (1 - \eta)x)P(\max_{i\ne j} X_{ni} \ge \varepsilon x).$$



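Overall, by the previous estimates and because $1 - \eta > \varepsilon$, we obtain

$$\Big|A - \sum_{j=1}^{k_n} P(X_{nj} \ge (1 - \eta)x)\Big| \le \Big(\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)\Big)^2 + 2\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)P(|S_n(j)| \ge \eta x) + \sum_{j=1}^{k_n} P((1 - \eta)x \le X_{nj} < (1 + \eta)x).$$

It remains to analyze the last term, $C$. Notice that

$$|C - P(S_n^{(\varepsilon x)} \ge x)| = P(S_n^{(\varepsilon x)} \ge x) - P(S_n^{(\varepsilon x)} \ge x,\ \max_i X_{ni} < \varepsilon x) = P(S_n^{(\varepsilon x)} \ge x,\ \max_{1\le i\le k_n} X_{ni} \ge \varepsilon x).$$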

l<i<kn 

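Now we treat this term by the same arguments we have already used, by dividing
the maximum in two parts:

$$P(S_n^{(\varepsilon x)} \ge x,\ \max_{1\le i\le k_n} X_{ni} \ge \varepsilon x) = \sum_{j=1}^{k_n} P(S_n^{(\varepsilon x)} \ge x,\ X_{nj} \ge \varepsilon x,\ \max_{i\ne j} X_{ni} < \varepsilon x) + G = \sum_{j=1}^{k_n} F_j + G,$$

where $G$ is the probability that $S_n^{(\varepsilon x)} \ge x$ and at least two of the variables are
at least $\varepsilon x$. The last term, $G$, is majorized exactly as $B$. As for the first term, we
notice that because $X_{nj} \ge \varepsilon x$ the term $X_{nj}^{(\varepsilon x)}$ does not appear in the sum, and by
independence we obtain

$$F_j = P(S_n^{(\varepsilon x)}(j) \ge x,\ X_{nj} \ge \varepsilon x,\ \max_{i\ne j} X_{ni} < \varepsilon x) \le P(S_n^{(\varepsilon x)}(j) \ge x)P(X_{nj} \ge \varepsilon x).$$

Now, clearly we have

$$P(S_n^{(\varepsilon x)}(j) \ge x) \le P(\max_{i\ne j} X_{ni} \ge \varepsilon x) + P(S_n^{(\varepsilon x)}(j) \ge x,\ \max_{i\ne j} X_{ni} < \varepsilon x)$$
$$= P(\max_{i\ne j} X_{ni} \ge \varepsilon x) + P(S_n(j) \ge x,\ \max_{i\ne j} X_{ni} < \varepsilon x),$$

implying that

$$\sum_{j=1}^{k_n} F_j \le \sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)\big(P(\max_i X_{ni} \ge \varepsilon x) + P(S_n(j) \ge x)\big).$$

Overall,

$$|C - P(S_n^{(\varepsilon x)} \ge x)| \le 2\Big(\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)\Big)^2 + \sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x)P(S_n(j) \ge x).$$

By gathering all the information above and taking into account that

$$\Big|P(S_n \ge x) - P(S_n^{(\varepsilon x)} \ge x) - \sum_{j=1}^{k_n} P(X_{nj} \ge (1 - \eta)x)\Big| \le \Big|A - \sum_{j=1}^{k_n} P(X_{nj} \ge (1 - \eta)x)\Big| + |C - P(S_n^{(\varepsilon x)} \ge x)| + |B|,$$

the lemma is established. $\diamond$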



If $S_n$ is stochastically bounded, i.e. $\lim_{K\to\infty}\sup_n P(|S_n| > K) = 0$, the
approximation in Lemma 4.1 has a simple asymptotic form.

Proposition 4.1 Assume $S_n$ is stochastically bounded, the variables are centered,
and $x_n \to \infty$. Then for any $0 < \eta < 1$ and $\varepsilon > 0$ such that $1 - \eta > \varepsilon$,

$$P(S_n \ge x_n) = P(S_n^{(\varepsilon x_n)} \ge x_n) + \sum_{j=1}^{k_n} P(X_{nj} \ge (1 - \eta)x_n) + o\Big(\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x_n)\Big) + O\Big(\sum_{j=1}^{k_n} P((1 - \eta)x_n \le X_{nj} < (1 + \eta)x_n)\Big).$$



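Proof. We just notice that for independent centered random variables, if
$S_n$ is stochastically bounded, by the Levy inequality (Inequality 1.1.3 in de la Pena
and Gine 1999), $\max_{1\le i\le k_n} |X_{ni}|$ is stochastically bounded too. By
taking into account that $|S_n(j)| \le |S_n| + \max_{1\le i\le k_n} |X_{ni}|$ we obtain

$$\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x_n)P(|S_n(j)| \ge \eta x_n) \le \max_{1\le j\le k_n} P(|S_n(j)| \ge \eta x_n)\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x_n)$$
$$\le \Big[P(|S_n| \ge \eta x_n/2) + P\big(\max_{1\le i\le k_n} |X_{ni}| \ge \eta x_n/2\big)\Big]\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x_n) = o\Big(\sum_{j=1}^{k_n} P(X_{nj} \ge \varepsilon x_n)\Big) \quad \text{as } n \to \infty.$$

Then, by independence

$$P\big(\max_{1\le j\le k_n} |X_{nj}| \ge \varepsilon x_n\big) = P(|X_{n1}| \ge \varepsilon x_n) + \sum_{k=2}^{k_n} P\big(\max_{1\le j\le k-1} |X_{nj}| < \varepsilon x_n\big)P(|X_{nk}| \ge \varepsilon x_n)$$
$$\ge P\big(\max_{1\le j\le k_n} |X_{nj}| < \varepsilon x_n\big)\sum_{k=1}^{k_n} P(|X_{nk}| \ge \varepsilon x_n),$$

that gives

$$\Big(\sum_{j=1}^{k_n} P(|X_{nj}| \ge \varepsilon x_n)\Big)^2 \le \frac{P(\max_{1\le j\le k_n} |X_{nj}| \ge \varepsilon x_n)}{P(\max_{1\le j\le k_n} |X_{nj}| < \varepsilon x_n)}\sum_{j=1}^{k_n} P(|X_{nj}| \ge \varepsilon x_n) = o\Big(\sum_{j=1}^{k_n} P(|X_{nj}| \ge \varepsilon x_n)\Big) \quad \text{as } n \to \infty,$$

since $x_n \to \infty$ and $\max_{1\le j\le k_n} |X_{nj}|$ is stochastically bounded. $\diamond$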






4.2 Proof of Theorem 2.2

It is convenient to normalize by the variance of the partial sum, and we shall
assume without loss of generality that

$$\sum_{i=1}^{k_n} c_{ni}^2 = 1 \quad \text{and} \quad \max_{1\le i\le k_n} c_{ni}^2 \to 0. \qquad (25)$$

Then we have $\sum_{i=1}^{k_n} c_{ni}^t \le \max_{1\le i\le k_n} c_{ni}^{t-2} \to 0$, implying that $D_{nt}^{-1} \to \infty$.
Moreover, the sequence $\sum_{i=1}^{k_n} c_{ni}\xi_i$ is stochastically bounded and we analyze the last
three terms in Proposition 4.1. Let $x = x_n \to \infty$. By (4) and taking into account
that $x/c_{ni} \ge x \to \infty$ and $h$ is a slowly varying function we derive for any $\gamma$ fixed

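$$\sum_{i=1}^{k_n} c_{ni}^t h\Big(\frac{x}{c_{ni}}\Big)\Big|\frac{h(\gamma x/c_{ni})}{h(x/c_{ni})} - 1\Big| = o\Big(\sum_{i=1}^{k_n} c_{ni}^t h\Big(\frac{x}{c_{ni}}\Big)\Big),$$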

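implying that

$$\frac{\sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge (1 \pm \eta)x)}{\sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge x)} = \frac{(1 + o_1(1))\sum_{i=1}^{k_n} c_{ni}^t h((1 \pm \eta)x/c_{ni})}{(1 \pm \eta)^t(1 + o_2(1))\sum_{i=1}^{k_n} c_{ni}^t h(x/c_{ni})} \to 1$$

when $n \to \infty$ followed by $\eta \to 0$.

Then, we also have

$$\frac{\sum_{i=1}^{k_n} P((1 - \eta)x \le c_{ni}\xi_i < (1 + \eta)x)}{\sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge x)} \to 0 \quad \text{as } n \to \infty \text{ and } \eta \to 0.$$

Similarly, for every $\varepsilon > 0$ fixed we have that

$$\frac{\sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge \varepsilon x)}{\sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge x)} = \frac{(1 + o_1(1))(1 + o_3(1))}{\varepsilon^t(1 + o_2(1))} \to \frac{1}{\varepsilon^t} \quad \text{as } n \to \infty,$$

and then,

$$\sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge \varepsilon x) \ll \sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge x) \quad \text{as } n \to \infty.$$

So far, for any $\varepsilon > 0$ fixed, by letting $n \to \infty$ first and after that passing with $\eta$
to 0, we deduce by the above considerations combined with Proposition 4.1 that

$$P(S_n \ge x) = \sum_{i=1}^{k_n} P(c_{ni}\xi_i \ge x)(1 + o(1)) + P(S_n^{(\varepsilon x)} \ge x) \quad \text{as } n \to \infty. \qquad (26)$$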

4=1 

It remains to study the term $P(S_n^{(\varepsilon x)} \ge x)$. We shall base this part of the proof
on Corollary 1.7 in S. Nagaev (1979), given in the Appendix, which we apply
with $m > t$, that will be selected later. Because we assume $E(\xi_0^2) = 1$, we have
for all $y$, $B_n^2(-\infty, y) \le 1$, and therefore, Theorem 5.1 implies:

$$P(S_n^{(\varepsilon x)} \ge x) \le \exp(-\alpha^2x^2/(2e^m)) + (A(m; 0, \varepsilon x)/(\beta\varepsilon^{m-1}x^m))^{\beta/\varepsilon},$$

with $\alpha = 1 - \beta = 2/(m + 2)$. Then, obviously, it is enough to select $x \to \infty$ and
$\varepsilon > 0$ such that

$$\exp\Big(-\frac{\alpha^2x^2}{2e^m}\Big) + \Big(\frac{A(m; 0, \varepsilon x)}{\beta\varepsilon^{m-1}x^m}\Big)^{\beta/\varepsilon} = o\Big(x^{-t}\sum_{i=1}^{k_n} c_{ni}^t h\Big(\frac{x}{c_{ni}}\Big)\Big). \qquad (27)$$

Let $x = x_n = C_m[\ln(D_{nt}^{-1})]^{1/2}$ where $C_m > e^{m/2}(m + 2)/\sqrt{2}$. As we mentioned at
the beginning of the proof, $x \to \infty$.

We shall estimate each term in the left hand side of (27) separately. Because,
by the definition of $\alpha$, we have $C_m > e^{m/2}\alpha^{-1}\sqrt{2}$, we can select $0 < \eta < 1$ such
that $C_m^2\alpha^2/(2e^m) = (1 - \eta)^{-1}$.

Taking into account the fact that for any c > and d > we have y'^ exp(— cy) 
o(exp(— c(l — ri)y) as y — ?■ 00, by the definition on x and 77, we obtain: 

exp(-^) = o(exp(-^g;(l - ,)) 
fc„ fc„ 



1=1 1=1 
Applying now the Holder inequality we clearly have. 



E 4. = E ^ (E -™)''(E (28) 



Taking into account that X]f=i '^ni ~ obtain overall 



exp(-^) = o ^-it-2n)/ii-v)Y^J^7^n)/(i-v) 



2e 



It remains to notice that because $t > 2$, we have $(t - 2\eta)/(1 - \eta) > t$. Then, by
combining this observation with the properties of slowly varying functions we
have

$$\exp\Big(-\frac{\alpha^2x^2}{2e^m}\Big) = o\Big(x^{-t}\sum_{i=1}^{k_n} c_{ni}^t h\Big(\frac{x}{c_{ni}}\Big)\Big).$$

We select $\varepsilon$ by analyzing the second term in the left hand side of (27). Notice
that by the integration by parts formula, for every $z > y > 0$,

$$E\xi_0^mI(0 < \xi_0 < z) = -z^mP(\xi_0 \ge z) + m\int_0^z u^{m-1}P(\xi_0 \ge u)\,du \le y^m + m\int_y^z u^{m-1}P(\xi_0 \ge u)\,du.$$

Replacing $z = \varepsilon x/c_{ni}$, taking into account condition (4), the properties of slowly
varying functions, and the facts that $x/c_{ni} \to \infty$ and $m > t$, we easily obtain
for $y$ sufficiently large



It follows that 

A{m- 0, ex) = ^ CE^o™/(0 < c„,eo < ex) 
1=1 

« EC( — — ) « a;-* — )■ 

The second term, has the order 



immediately as /3/e > 1. This condition leads to the selection of e with < e < 
/?• 

Overall we obtain for any $x = C_m(\ln(\sum_{i=1}^{k_n} c_{ni}^t)^{-1})^{1/2}$ with
$C_m > e^{m/2}(m + 2)/\sqrt{2}$,

$$P(S_n \ge x) \le (1 + o(1))\sum_{j=1}^{k_n} P(c_{nj}\xi_0 \ge x) \quad \text{as } n \to \infty,$$

where $m > t$. Since $C_t > e^{t/2}(t + 2)/\sqrt{2}$ we can select and fix $m > t$ such that
$C_t > e^{m/2}(m + 2)/\sqrt{2}$.

We combine this result with the lower bound to complete the proof of this
theorem.

4.3 Proof of Theorem 2.3

This result easily follows from Theorem 1 in Frolov (2005) when moments
strictly larger than 2 are available. This theorem is given for convenience
in the Appendix (Theorem 5.2). Because we assume the existence of moments
of order $p$, we have

$$A_n(u, s, \varepsilon) \le \frac{u}{\sigma_n^2}\sum_{j=1}^{k_n} c_{nj}^2E\xi_0^2I(|c_{nj}\xi_0| \ge \varepsilon\sigma_n/s) \le \frac{us^{p-2}}{\varepsilon^{p-2}\sigma_n^p}\sum_{j=1}^{k_n} |c_{nj}|^pE|\xi_0|^p = \varepsilon^{2-p}us^{p-2}L_{np},$$

where $L_{np} = \sigma_n^{-p}\sum_{j=1}^{k_n} |c_{nj}|^pE|\xi_0|^p$. Then, for $x^2 \le 2\ln(1/L_{np})$,



The proof is immediate from Theorem 5.2 Just notice that by (10 1 

|p-2 



maxi<j<fc„ |c„ 



Then x'^ - 21n(L,7pi) - (p - 1) lnln(L,7pi) ^ -oo provided x^ < 2lii{L-^) 

p/21nln(L~p). It remains to notice that for n sufficiently large < 21n(£'7p] 
lnln(Z3~p) and the result follows. 

4.4 Proof of Theorem 2.1

For simplicity we normalize by the variance and assume (25). Without restricting
the generality we assume $2 < p < t$. We start from relation (26) and
apply Proposition 5.1 to the second term in the right hand side. We obtain for
any $\varepsilon > 0$ and $x^2 \le c_\varepsilon\ln(D_{np}^{-1})$ with $c_\varepsilon < 1/\varepsilon$ and for all $n$ sufficiently large



ns'n 



> X) 



(1 - $(a;))(l + o(l)). We notice now that by (281 applied with 



rj = {t ~ p)/{t — 2) and simple considerations, 



(29) 



So far, by using this last relation, we practically showed that (11) holds for $0 < x \le C[\ln(D_{nt}^{-1})]^{1/2}$ with $C$ an arbitrary positive number. On the other hand, because $1 - \Phi(x) \le x^{-1}\exp(-x^2/2)$, by Theorem 2.2 and by the arguments leading to the proof of relation (27), there is a constant $c > 0$ such that for $x > c[\ln(D_{nt}^{-1})]^{1/2}$, we simultaneously have



$$P(S_n > x) = (1 + o(1)) \sum_{i=1}^{k_n} P(c_{ni}\xi_0 > x)$$

and

$$1 - \Phi(x) = o\Bigl(\sum_{i=1}^{k_n} P(c_{ni}\xi_0 > x)\Bigr).$$

Then (11) holds for all $x > 0$ since $C$ is arbitrarily large and can be selected such that $c < C$.
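The Gaussian tail bounds used in this proof and in the next one are easy to check numerically. The following Python sketch is only an illustration: the helper names `normal_tail` and `tail_bounds` are hypothetical, and the lower bound $(2\pi)^{-1/2}(1+x)^{-1}\exp(-x^2/2)$ is a standard Mills-ratio estimate that is assumed here rather than taken from the text.

```python
import math


def normal_tail(x: float) -> float:
    """Return 1 - Phi(x) for the standard normal law via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_bounds(x: float):
    """Return (lower, tail, upper) for x > 0.

    upper is min(exp(-x^2/2), exp(-x^2/2)/x), the two upper bounds quoted in the proofs.
    lower is exp(-x^2/2) / (sqrt(2*pi) * (1 + x)), an assumed Mills-ratio lower bound.
    """
    gauss = math.exp(-0.5 * x * x)
    lower = gauss / (math.sqrt(2.0 * math.pi) * (1.0 + x))
    upper = min(gauss, gauss / x)
    return lower, normal_tail(x), upper


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0, 8.0):
        lower, tail, upper = tail_bounds(x)
        print(f"x={x:4.1f}  lower={lower:.3e}  1-Phi={tail:.3e}  upper={upper:.3e}")
```

Both sides differ from $1-\Phi(x)$ only by factors that are polynomial in $x$, which is why such bounds suffice when the comparison is made on the scale of $\exp(-x^2/2)$.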



4.5 Proof of Corollary 2.1 

The ideas involved in the proof of this corollary already appeared in the previous proofs, so we shall mention only the changes. We start from (11). To prove (12) we have to show that

$$1 - \Phi(x) = o\Bigl(\sum_{i=1}^{k_n} P(c_{ni}\xi_0 > x\sigma_n)\Bigr)$$

for $x > a(\ln D_{nt}^{-1})^{1/2}$ with $a > 2^{1/2}$. First we shall use the relation $1 - \Phi(x) \le \exp(-x^2/2)$. Then, we adapt the proof we used to establish the first part of (27), when we compared $\exp(-\alpha^2 x^2/(2e^m))$ to $\sum_{i=1}^{k_n} P(c_{ni}\xi_0 > x\sigma_n)$. The main difference is that now we take $m = 0$ and $\alpha = 1$.

For the proof of (13), we use the inequality $1 - \Phi(x) \ge (2\pi)^{-1/2}(1+x)^{-1}\exp(-x^2/2)$. By Q and (29) we have for every $0 < \varepsilon < t - 2$,



[Dnt] 



(t-2-e)/(t-2) 



Then, it is easy to see that, because $\varepsilon$ can be made arbitrarily small, for $1 \le x \le b(\ln D_{nt}^{-1})^{1/2}$ with $b < 2^{1/2}$ we have

$$\sum_{i=1}^{k_n} P(c_{ni}\xi_0 > x\sigma_n) = o(1 - \Phi(x)).$$



When $0 < x < 1$ we apply Theorem 2.3.
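The role of the constant $2^{1/2}$ can be illustrated numerically. The sketch below rests on assumptions that are not in the text: equal weights $c_{ni} = n^{-1/2}$, $\sigma_n = 1$, and an exact Pareto right tail $P(\xi_0 > u) = u^{-t}$ for $u \ge 1$, so that $D_{nt} = n^{1-t/2}$. All function names are hypothetical.

```python
import math


def normal_tail(x: float) -> float:
    """Return 1 - Phi(x) for the standard normal law."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def heavy_tail_sum(x: float, n: int, t: float) -> float:
    """Sum over i of P(c_ni * xi_0 > x) for c_ni = n^{-1/2} and P(xi_0 > u) = u^{-t}, u >= 1."""
    u = x * math.sqrt(n)
    return n * min(1.0, u ** (-t))


def log_inv_D_equal_weights(n: int, t: float) -> float:
    """ln(D_nt^{-1}) when D_nt = sum_i c_ni^t = n^{1 - t/2} (equal-weights assumption)."""
    return (0.5 * t - 1.0) * math.log(n)


def tail_ratio(a: float, n: int, t: float) -> float:
    """Ratio of the heavy-tail sum to the normal tail at x = a * (ln D_nt^{-1})^{1/2}."""
    x = a * math.sqrt(log_inv_D_equal_weights(n, t))
    return heavy_tail_sum(x, n, t) / normal_tail(x)


if __name__ == "__main__":
    n, t = 10**8, 3.0
    for a in (1.0, 1.2, 1.6, 2.0):
        print(f"a={a:.1f}  ratio={tail_ratio(a, n, t):.3e}")
```

For $a$ well below $2^{1/2}$ the ratio is small (the normal tail dominates), and for $a$ well above $2^{1/2}$ it is large (the sum of marginal tails dominates). Close to the boundary the polynomial factors matter and the convergence in $n$ is slow.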



4.6 Proof of Corollary 2.5 



This corollary follows from Corollary 2.4 via Lemma 5.1 in the Appendix. It remains to give an explicit form of the moderate deviation and large deviation boundaries. For proving the large deviation part of this corollary we have to analyze the condition on $x$ from part (ii) of Corollary 2.4, namely $x > a(\ln D_{nt}^{-1})^{1/2}$ with $a > \sqrt{2}$. By Lemma 5.1

$$B_{n2} \sim c_r\, n^{3-2r}\, l^2(n)$$

and

$$C_1\, l^t(n)\, n^{(1-r)t+1} \le B_{nt} \le C_2\, l^t(n)\, n^{(1-r)t+1}.$$



Then, for certain constants $K_1$ and $K_2$ and because $D_{nt}^{-1} = B_{n2}^{t/2}/B_{nt}$, we have for $n$ sufficiently large

$$K_1 + \ln n^{(t-2)/2} \le \ln D_{nt}^{-1} \le K_2 + \ln n^{(t-2)/2}.$$

So, the asymptotic result (12) holds for $x > c_1(\ln n)^{1/2}$ where $c_1 > (t-2)^{1/2}$. Furthermore, (13) holds for $0 < x \le c_2(\ln n)^{1/2}$ where $c_2 < (t-2)^{1/2}$.
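The growth rate $\ln D_{nt}^{-1} = \frac{t-2}{2}\ln n + O(1)$ can be observed numerically. The sketch below assumes $l \equiv 1$, that is $a_i = i^{-r}$, takes $B_{nt} = \sum_j b_{nj}^t$ with $b_{nj}$ as in Lemma 5.1, and truncates the infinite sum over $j$ at a fixed multiple of $n$ (`horizon`, a hypothetical parameter). The truncation shifts the level of $\ln D_{nt}^{-1}$ but, being proportional to $n$, should not change its growth rate in $\ln n$.

```python
import math


def log_inv_D(n: int, r: float, t: float, horizon: int = 50) -> float:
    """Approximate ln(D_nt^{-1}) = (t/2) ln B_n2 - ln B_nt for a_i = i^{-r}.

    The sums over j are truncated at j = horizon * n (an assumption of this sketch).
    """
    J = horizon * n
    # prefix[k] = a_1 + ... + a_k, with prefix[0] = 0
    prefix = [0.0] * (J + 1)
    for i in range(1, J + 1):
        prefix[i] = prefix[i - 1] + i ** (-r)
    B2 = 0.0
    Bt = 0.0
    for j in range(1, J + 1):
        # b_nj = a_1 + ... + a_j for j <= n, and a_{j-n+1} + ... + a_j for j > n
        b = prefix[j] - prefix[max(j - n, 0)]
        B2 += b * b
        Bt += b ** t
    return 0.5 * t * math.log(B2) - math.log(Bt)


def growth_rate(n1: int, n2: int, r: float, t: float) -> float:
    """Slope of ln(D_nt^{-1}) against ln n between n1 and n2."""
    return (log_inv_D(n2, r, t) - log_inv_D(n1, r, t)) / math.log(n2 / n1)


if __name__ == "__main__":
    r, t = 0.6, 4.0
    print("empirical slope:", growth_rate(200, 800, r, t))
    print("predicted slope:", (t - 2.0) / 2.0)
```

At moderate $n$ the empirical slope is only close to $(t-2)/2$, not equal to it, because the partial sums of $i^{-r}$ carry lower-order corrections that vanish slowly.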



4.7 Proof of Theorem 2.4

Without restricting the generality we assume $\kappa > 0$, since similar computations can be done when $\kappa < 0$. Let $A_n = \sum_{i=n}^{\infty} a_i^2$. Using the argument of Theorem 5 in Wu (2006), under Condition B, we have

$$\|\mathcal{P}_0(K(X_n) - \kappa X_n)\|_q = O(\theta_n), \quad \text{where } \theta_n = |a_n|^{p/q} + |a_n| A_n^{1/2}.$$
Let = if i < and e„ = Xir=i - Then by Theorem 1 in Wu (2007), there

: 1 
SuJ' 



exists a constant Bq > 1 such that 



q 



(30) 



'i— n+l 



By Karamata's theorem, $A_n \sim (2r-1)^{-1} n^{1-2r} l^2(n)$, and if $i > n$, $\Theta_{n+i} - \Theta_i = O(n\theta_i)$ and $\sum_{i=n+1}^{\infty} \theta_i^2 = O(n\theta_n^2)$. Let $\ell(\cdot)$ be a slowly varying function and $\beta \in \mathbb{R}$. Again by Karamata's theorem, there exists another slowly varying function $\ell_0(\cdot)$ such that $\sum_{i=1}^{n} i^{-\beta}\ell(i) = O(1 + n^{1-\beta})\ell_0(n)$. Hence by (30), there exists a slowly varying function $\ell_1(\cdot)$ such that

$$\|S_{n,1}\|_q = O(\sqrt{n})(1 + n^{1-rp/q} + n^{1-r+(1-2r)/2})\ell_1(n). \tag{31}$$

For $n \ge 3$ let $g_n = (\ln n)^{-1}$. Then

$$P(S_n > (x+g_n)\sigma_n) - P(H_n > \kappa x\sigma_n) \le P(|S_{n,1}| > \kappa g_n\sigma_n). \tag{32}$$

Since $x^2 \le c\ln n$ and $g_n = (\ln n)^{-1}$, we have that $1 - \Phi(x \pm g_n) \sim 1 - \Phi(x)$. Hence by Corollary 2.5, (21) follows from (32) in view of

P(|^„,l|>-ffna«)<^^^- gUnV2-rli^n)Y ^^^^ 

= ---^'-^4^ - ^ = o^xe-^'^) = o[l - $(,)], 
gnl (n) Inn 

since $c/2 < p\rho(r)$. Here we note that $\ell_1(n)/(g_n l(n))$ is also slowly varying in $n$ and $x^2 \le c\ln n$. By (31) and (33), it is easily seen that the normalizing constant $\kappa\sigma_n$ can be replaced by $\sqrt{\mathrm{var}(H_n)}$. The proof of the upper bound is similar and it is left to the reader.



5 Appendix 

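The following theorem is a slight reformulation of the Fuk-Nagaev inequality (see Corollary 1.7, S. Nagaev, 1979):

Theorem 5.1 Let $Y_1, Y_2, \ldots, Y_n$ be independent random variables and $m > 2$. Suppose $EY_i = 0$, $i = 1, \ldots, n$, $\beta = m/(m+2)$, and $\alpha = 1 - \beta = 2/(m+2)$. For $y > 0$, define $Y_i^{(y)} = Y_i I(Y_i \le y)$, $A_n(m; 0, y) := \sum_{i=1}^{n} E[Y_i^m I(0 < Y_i < y)]$ and $B_n^2(-\infty, y) := \sum_{i=1}^{n} E[Y_i^2 I(Y_i < y)]$. Then for any $x > 0$ and $y > 0$

$$P\Bigl(\sum_{i=1}^{n} Y_i^{(y)} \ge x\Bigr) \le \exp\Bigl(-\frac{\alpha^2 x^2}{2e^m B_n^2(-\infty,y)}\Bigr) + \Bigl(\frac{A_n(m;0,y)}{\beta x y^{m-1}}\Bigr)^{\beta x/y}.$$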

We shall also use the following result which is an immediate consequence of 
Theorem 1.1 in Frolov (2005). 
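Theorem 5.2 Let $(X_{nj})_{1 \le j \le k_n}$ be an array of row-wise independent centered random variables. Let $p > 2$ and denote $S_n = \sum_{j=1}^{k_n} X_{nj}$, $\sigma_n^2 = \sum_{j=1}^{k_n} EX_{nj}^2 < \infty$, $M_{np} = \sum_{j=1}^{k_n} EX_{nj}^p I(X_{nj} > 0) < \infty$, $L_{np} = \sigma_n^{-p} M_{np}$ and denote

$$\Lambda_n(u,s,\varepsilon) = \frac{u}{\sigma_n^2}\sum_{j=1}^{k_n} EX_{nj}^2 I(X_{nj} < -\varepsilon\sigma_n/s).$$

Furthermore, assume $L_{np} \to 0$ and $\Lambda_n(x^4, x^5, \varepsilon) \to 0$ for any $\varepsilon > 0$. Then if $x \ge 0$ and $x^2 - 2\ln(L_{np}^{-1}) - (p-1)\ln\ln(L_{np}^{-1}) \to -\infty$, we have

$$P(S_n > x\sigma_n) = (1 - \Phi(x))(1 + o(1)).$$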

For truncated random variables by following the proof of Theorem 1.1 in 
Frolov (2005) we can present his relation (3.17) as a proposition. 



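Proposition 5.1 Assume the conditions in Theorem 5.2 are satisfied. Define

$$X_{nk}' = X_{nk} I(X_{nk} \le \varepsilon x\sigma_n) \quad \text{and} \quad S_n' = \sum_{k=1}^{k_n} X_{nk}'.$$

Fix $\varepsilon > 0$. Then if $x^2 \le c\ln(L_{np}^{-1})$ with $c < 1/\varepsilon$, for all $n$ sufficiently large we have

$$P(S_n' > x\sigma_n) = (1 - \Phi(x))(1 + o(1)).$$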

The following facts about the series are going to be used to analyze a class 
of linear processes: 

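Lemma 5.1 Assume $a_i = l(i)i^{-r}$ with $1/2 < r < 1$. Let $b_j = b_{nj} := \sum_{i=1}^{j} a_i$ if $1 \le j \le n$ and $b_{nj} := \sum_{i=j-n+1}^{j} a_i$ if $j > n$. Then, for two positive constants $C_1$ and $C_2$, we have

$$C_1\, l^t(n)\, n^{(1-r)t+1} \le \sum_{j=1}^{\infty} b_{nj}^t \le C_2\, l^t(n)\, n^{(1-r)t+1},$$

for any $t > 2$. In the case $t = 2$, $\sum_{j=1}^{\infty} b_{nj}^2 \sim c_r\, n^{3-2r}\, l^2(n)$ with

$$c_r = \Bigl\{\int_0^{\infty} [x^{1-r} - \max(x-1,0)^{1-r}]^2\, dx\Bigr\}\Big/(1-r)^2.$$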
Jo 

Proof. It is easy to see that $b_{nj} \ll j^{1-r} l(j)$ for $j \le 2n$ and $b_{nj} \ll n(j-n)^{-r} l(j)$ for $j > 2n$ from the Karamata theorem (see part 1 of Lemma 5.4 in Peligrad and Sang (2010)). Therefore,

$$\sum_{j=1}^{\infty} b_{nj}^t = \sum_{j=1}^{2n} b_{nj}^t + \sum_{j=2n+1}^{\infty} b_{nj}^t \ll \sum_{j=1}^{2n} j^{(1-r)t} l^t(j) + \sum_{j=2n+1}^{\infty} n^t (j-n)^{-rt} l^t(j) = O(l^t(n)\, n^{(1-r)t+1}).$$

The proof in the other direction is similar. The result of case $t = 2$ is well known. See for instance Theorem 2 in Wu and Min (2005).






References 

[1] Acerbi, C. and D. Tasche (2002). On the coherence of expected shortfall. 
J. Banking and Finance 26, 1487-1503.

[2] Beran, J. (1994). Statistics for long-memory processes. Monographs on
Statistics and Applied Probability 61, Chapman and Hall, New York. 

[3] Bingham, N. H., C. M. Goldie and J. L. Teugels (1987). Regular Variation.
Cambridge, UK: Cambridge University Press. 

[4] Chen, X. (2001). Moderate deviations for Markovian occupation times. 
Stochastic Process. Appl. 94, 51-70. 

[5] de la Peña, V. and E. Giné (1999). Decoupling. From dependence to independence. Springer.

[6] Doukhan, P., G. Oppenheim and M. S. Taqqu (editors) (2003). Theory and
Applications of Long-Range Dependence, Birkhäuser, Boston.

[7] Frolov, A. N. (2005). On probabilities of moderate deviations of sums for in- 
dependent random variables. Journal of Mathematical Sciences 127, 1787- 
1796. 

[8] Ghosh, M. (1974). Probabilities of moderate deviations under m- 
dependence. Canad. J. Statist. 2, 157-168. 

[9] Grama, I. G. (1997). On moderate deviations for martingales. Ann. Probab.
25, 152-183.

[10] Grama, I. G. and E. Haeusler (2006). An asymptotic expansion for prob- 
abilities of moderate deviations for multivariate martingales. J. Theoret. 
Probab 19, 1-44. 

[11] Granger, C. W. and R. Joyeux (1980). An introduction to long-memory 
time series models and fractional differencing. J. Time Ser. Anal. 1, 15-29.

[12] Hall, P. (1992). Convergence rates in the central limit theorem for means 
of autoregressive and moving average sequences. Stochastic Process. Appl. 
43, 115-131. 

[13] Ho, H. C. and T. Hsing (1997). Limit theorems for functionals of moving 
average. Ann. Probab. 25, 1636-1669. 

[14] Holton, Glyn (2003). Value-at-Risk: Theory and Practice. Academic Press. 

[15] Hosking, J. R. M. (1981). Fractional differencing. Biometrika 68, 165-176. 

[16] Hurst, Simon, The Characteristic Function of the Student-t Distribution, 
Financial Mathematics Research Report No. FMRR006-95, Statistics Re- 
search Report No. SRR044-95 






[17] Jorion, Philippe (2006). Value at Risk: The New Benchmark for Managing 
Financial Risk (3rd ed.). McGraw Hill.

[18] McNeil, Alexander, Frey, Rüdiger and Embrechts, Paul (2005). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press.

[19] Mikosch, T. and A. V. Nagaev (1998). Large Deviations of Heavy-Tailed 
Sums with Applications in Insurance. Extremes 1:1, 81-110.

[20] Nagaev, A. V. (1969). Limit theorems for large deviations where Cramér's
conditions are violated (in Russian). Izv. Akad. Nauk UzSSR Ser. Fiz.-Mat. 
Nauk 6, 17-22. 

[21] Nagaev, S. V. (1965). Some limit theorems for large deviations. Teor. Veroy- 
atn. Primen. 10, 231-254. 

[22] Nagaev, S. V. (1979). Large deviations of sums of independent random 
variables. Ann. Probab. 7, 745-789.

[23] Peligrad, M. and H. Sang (2010). Asymptotic properties of self-normalized
linear processes with long memory, submitted.

[24] Peligrad, M. and S. Utev (1997). Central limit theorem for linear processes. 
Ann. Probab. 25, 443-456.

[25] Robinson, P. M. (1997). Large-sample inference for non parametric regres- 
sion with dependent errors. Ann. Statist. 25, 2054-2083. 

[26] Robinson, P. M. (2003). Time series with long memory, Oxford University 
Press 

[27] Rozovski, L. V. (1993). Probabilities of large deviations on the whole axis. 
Theory Probab. Appl. 38, 53-79. 

[28] Rubin, H. and J. Sethuraman (1965). Probabilities of moderate deviations.
Sankhya Ser. A 27, 325-346.

[29] Slastnikov, A. D. (1978). Limit theorems for probabilities of moderate deviations. Teor. Veroyatn. Primen. 24, 340-357.

[30] Wu, W. B. (2006). Unit root testing for functional of linear processes. 
Econometric Theory 22, 1-14. 

[31] Wu, W. B. (2007). Strong invariance principles for dependent random vari- 
ables. Annals of Probability 35, 2294-2320. 

[32] Wu, W. B. and W. Min (2005). On Linear Processes with Dependent Innovations. Stochastic Processes and their Applications 115, 939-958.

[33] Wu, W. B. and Z. Zhao (2008). Moderate deviations for stationary pro- 
cesses. Statistica Sinica 18, 769-782. 






